Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 432 | 439 |
| Missing cells (%) | 8.1% | 8.2% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
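The derived rows in this table can be recomputed from the raw counts. The sketch below assumes that the missing-cell percentage is the number of missing cells divided by rows × columns, and that the average record size is the total memory divided by the number of observations. The helper name `missing_cell_pct` is illustrative and is not part of the profiling tool.

```python
# Figures copied from the Dataset statistics table.
n_variables = 12
n_observations = 446
missing_cells = {"Dataset A": 432, "Dataset B": 439}
total_size_kib = 45.3


def missing_cell_pct(missing: int, rows: int, cols: int) -> float:
    """Share of missing cells among all rows * cols cells, in percent."""
    return 100.0 * missing / (rows * cols)


missing_pct = {
    name: round(missing_cell_pct(count, n_observations, n_variables), 1)
    for name, count in missing_cells.items()
}

# Average record size in bytes: total memory (KiB -> bytes) per observation.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct)
print(avg_record_bytes)
```

Both results agree with the reported values after rounding to one decimal place.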
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Age has 81 (18.2%) missing values | Age has 101 (22.6%) missing values | Missing |
| Cabin has 351 (78.7%) missing values | Cabin has 337 (75.6%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 306 (68.6%) zeros | SibSp has 308 (69.1%) zeros | Zeros |
| Parch has 336 (75.3%) zeros | Parch has 338 (75.8%) zeros | Zeros |
| Fare has 5 (1.1%) zeros | Fare has 9 (2.0%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-05-07 19:22:23.903915 | 2024-05-07 19:22:27.789959 |
| Analysis finished | 2024-05-07 19:22:27.788810 | 2024-05-07 19:22:31.810183 |
| Duration | 3.88 seconds | 4.02 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
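The Duration row is the difference between the two timestamps. A minimal sketch using only the standard library, with the timestamp strings copied from the table; the format string is an assumption based on how they are printed:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed from the printed timestamps

# (analysis started, analysis finished) for each dataset
runs = {
    "Dataset A": ("2024-05-07 19:22:23.903915", "2024-05-07 19:22:27.788810"),
    "Dataset B": ("2024-05-07 19:22:27.789959", "2024-05-07 19:22:31.810183"),
}

durations = {
    name: round(
        (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds(),
        2,
    )
    for name, (start, end) in runs.items()
}

print(durations)
```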
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 438.5583 | 438.30269 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 2 |
| Maximum | 889 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 2 |
| 5-th percentile | 44.5 | 38.5 |
| Q1 | 220.5 | 202.25 |
| median | 435.5 | 443.5 |
| Q3 | 653.75 | 671.5 |
| 95-th percentile | 848.75 | 840.25 |
| Maximum | 889 | 891 |
| Range | 887 | 889 |
| Interquartile range (IQR) | 433.25 | 469.25 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 256.34908 | 261.9855 |
| Coefficient of variation (CV) | 0.5845268 | 0.59772733 |
| Kurtosis | -1.1315409 | -1.2588459 |
| Mean | 438.5583 | 438.30269 |
| Median Absolute Deviation (MAD) | 218 | 236 |
| Skewness | 0.055681034 | 0.0070747928 |
| Sum | 195597 | 195483 |
| Variance | 65714.849 | 68636.4 |
| Monotonicity | Not monotonic | Not monotonic |
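Several PassengerId statistics are simple functions of others in the same tables. The sketch below recomputes them for Dataset A from the reported figures. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Dataset A figures for PassengerId, copied from the report.
n = 446
total = 195597
minimum, maximum = 2, 889
q1, q3 = 220.5, 653.75
std = 256.34908

mean = total / n                 # arithmetic mean = Sum / number of observations
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
variance = std ** 2              # Variance is the squared standard deviation
cv = std / mean                  # Coefficient of variation

print(f"mean={mean:.4f} range={value_range} iqr={iqr}")
print(f"variance={variance:.3f} cv={cv:.7f}")
```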
| Value | Count | Frequency (%) |
| 702 | 1 | 0.2% |
| 499 | 1 | 0.2% |
| 176 | 1 | 0.2% |
| 32 | 1 | 0.2% |
| 433 | 1 | 0.2% |
| 733 | 1 | 0.2% |
| 785 | 1 | 0.2% |
| 412 | 1 | 0.2% |
| 449 | 1 | 0.2% |
| 256 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 690 | 1 | 0.2% |
| 655 | 1 | 0.2% |
| 2 | 1 | 0.2% |
| 610 | 1 | 0.2% |
| 745 | 1 | 0.2% |
| 228 | 1 | 0.2% |
| 846 | 1 | 0.2% |
| 841 | 1 | 0.2% |
| 619 | 1 | 0.2% |
| 90 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 6 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 14 | 1 | |
| 20 | 1 | |
| 21 | 1 | |
| 22 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 0 | 0 |
| 3rd row | 1 | 1 |
| 4th row | 1 | 0 |
| 5th row | 0 | 1 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 273 | 61.2% |
| 1 | 173 | 38.8% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 259 | 58.1% |
| 1 | 187 | 41.9% |
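The Frequency (%) column is each count divided by the number of observations. A short sketch that derives it for the Survived counts, assuming one-decimal rounding as used elsewhere in the report; `frequency_pct` is an illustrative helper name:

```python
def frequency_pct(counts: dict) -> dict:
    """Convert value counts to percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Counts copied from the Common Values tables for Survived.
survived_a = frequency_pct({"0": 273, "1": 173})
survived_b = frequency_pct({"0": 259, "1": 187})

print(survived_a)
print(survived_b)
```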
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 273 | |
| 1 | 173 |
| Value | Count | Frequency (%) |
| 0 | 259 | |
| 1 | 187 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 273 | |
| 1 | 173 |
| Value | Count | Frequency (%) |
| 0 | 259 | |
| 1 | 187 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 273 | |
| 1 | 173 |
| Value | Count | Frequency (%) |
| 0 | 259 | |
| 1 | 187 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 273 | |
| 1 | 173 |
| Value | Count | Frequency (%) |
| 0 | 259 | |
| 1 | 187 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 3 | 3 |
| 3rd row | 2 | 3 |
| 4th row | 1 | 3 |
| 5th row | 3 | 3 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 248 | 55.6% |
| 1 | 101 | 22.6% |
| 2 | 97 | 21.7% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 249 | 55.8% |
| 1 | 113 | 25.3% |
| 2 | 84 | 18.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 248 | |
| 1 | 101 | |
| 2 | 97 | 21.7% |
| Value | Count | Frequency (%) |
| 3 | 249 | |
| 1 | 113 | |
| 2 | 84 | 18.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 248 | |
| 1 | 101 | |
| 2 | 97 | 21.7% |
| Value | Count | Frequency (%) |
| 3 | 249 | |
| 1 | 113 | |
| 2 | 84 | 18.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 248 | |
| 1 | 101 | |
| 2 | 97 | 21.7% |
| Value | Count | Frequency (%) |
| 3 | 249 | |
| 1 | 113 | |
| 2 | 84 | 18.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 248 | |
| 1 | 101 | |
| 2 | 97 | 21.7% |
| Value | Count | Frequency (%) |
| 3 | 249 | |
| 1 | 113 | |
| 2 | 84 | 18.8% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 82 |
| Median length | 50 | 49 |
| Mean length | 27.484305 | 27.004484 |
| Min length | 12 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 12258 | 12044 |
| Distinct characters | 59 | 60 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Silverthorne, Mr. Spencer Victor | Madill, Miss. Georgette Alexandra |
| 2nd row | Andersson, Miss. Sigrid Elisabeth | Palsson, Master. Gosta Leonard |
| 3rd row | Lehmann, Miss. Bertha | Coutts, Master. William Loch "William" |
| 4th row | Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) | Ivanoff, Mr. Kanio |
| 5th row | Vander Planke, Mrs. Julius (Emelia Maria Vandemoortele) | Lulic, Mr. Nikola |
| Value | Count | Frequency (%) |
| mr | 250 | 13.5% |
| miss | 99 | 5.4% |
| mrs | 65 | 3.5% |
| william | 30 | 1.6% |
| john | 24 | 1.3% |
| master | 20 | 1.1% |
| henry | 19 | 1.0% |
| mary | 13 | 0.7% |
| james | 13 | 0.7% |
| charles | 11 | 0.6% |
| Other values (900) | 1303 |
| Value | Count | Frequency (%) |
| mr | 262 | 14.3% |
| miss | 92 | 5.0% |
| mrs | 67 | 3.7% |
| william | 29 | 1.6% |
| john | 28 | 1.5% |
| master | 19 | 1.0% |
| james | 18 | 1.0% |
| henry | 15 | 0.8% |
| george | 14 | 0.8% |
| johan | 10 | 0.5% |
| Other values (884) | 1273 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1402 | 11.4% |
| r | 971 | 7.9% |
| e | 854 | 7.0% |
| a | 824 | 6.7% |
| s | 687 | 5.6% |
| n | 682 | 5.6% |
| i | 674 | 5.5% |
| l | 568 | 4.6% |
| M | 555 | 4.5% |
| o | 516 | 4.2% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1381 | 11.5% |
| r | 966 | 8.0% |
| e | 862 | 7.2% |
| a | 834 | 6.9% |
| n | 676 | 5.6% |
| i | 629 | 5.2% |
| s | 629 | 5.2% |
| M | 567 | 4.7% |
| o | 526 | 4.4% |
| l | 515 | 4.3% |
| Other values (50) | 4459 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 12258 |
| Value | Count | Frequency (%) |
| (unknown) | 12044 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1402 | 11.4% |
| r | 971 | 7.9% |
| e | 854 | 7.0% |
| a | 824 | 6.7% |
| s | 687 | 5.6% |
| n | 682 | 5.6% |
| i | 674 | 5.5% |
| l | 568 | 4.6% |
| M | 555 | 4.5% |
| o | 516 | 4.2% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1381 | 11.5% |
| r | 966 | 8.0% |
| e | 862 | 7.2% |
| a | 834 | 6.9% |
| n | 676 | 5.6% |
| i | 629 | 5.2% |
| s | 629 | 5.2% |
| M | 567 | 4.7% |
| o | 526 | 4.4% |
| l | 515 | 4.3% |
| Other values (50) | 4459 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 12258 |
| Value | Count | Frequency (%) |
| (unknown) | 12044 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1402 | 11.4% |
| r | 971 | 7.9% |
| e | 854 | 7.0% |
| a | 824 | 6.7% |
| s | 687 | 5.6% |
| n | 682 | 5.6% |
| i | 674 | 5.5% |
| l | 568 | 4.6% |
| M | 555 | 4.5% |
| o | 516 | 4.2% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1381 | 11.5% |
| r | 966 | 8.0% |
| e | 862 | 7.2% |
| a | 834 | 6.9% |
| n | 676 | 5.6% |
| i | 629 | 5.2% |
| s | 629 | 5.2% |
| M | 567 | 4.7% |
| o | 526 | 4.4% |
| l | 515 | 4.3% |
| Other values (50) | 4459 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 12258 |
| Value | Count | Frequency (%) |
| (unknown) | 12044 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1402 | 11.4% |
| r | 971 | 7.9% |
| e | 854 | 7.0% |
| a | 824 | 6.7% |
| s | 687 | 5.6% |
| n | 682 | 5.6% |
| i | 674 | 5.5% |
| l | 568 | 4.6% |
| M | 555 | 4.5% |
| o | 516 | 4.2% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1381 | 11.5% |
| r | 966 | 8.0% |
| e | 862 | 7.2% |
| a | 834 | 6.9% |
| n | 676 | 5.6% |
| i | 629 | 5.2% |
| s | 629 | 5.2% |
| M | 567 | 4.7% |
| o | 526 | 4.4% |
| l | 515 | 4.3% |
| Other values (50) | 4459 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.7443946 | 4.7174888 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2116 | 2104 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | female |
| 2nd row | female | male |
| 3rd row | female | male |
| 4th row | female | male |
| 5th row | female | male |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 280 | 62.8% |
| female | 166 | 37.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 286 | 64.1% |
| female | 160 | 35.9% |
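The character-level figures for Sex follow directly from the category counts. The sketch below rebuilds the per-character counts, the total character count and the mean length for Dataset A, assuming every row holds exactly the string `male` or `female`:

```python
from collections import Counter

# Category counts for Dataset A, copied from the Common Values table.
category_counts = {"male": 280, "female": 166}

# Each occurrence of a category contributes all of its characters.
char_counts = Counter()
for value, count in category_counts.items():
    for ch in value:
        char_counts[ch] += count

total_characters = sum(char_counts.values())
mean_length = total_characters / sum(category_counts.values())

print(char_counts.most_common())
print(total_characters, round(mean_length, 7))
```

The letter `e` occurs once in `male` and twice in `female`, which is why its count (612) exceeds the number of rows.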
Most occurring characters
| Value | Count | Frequency (%) |
| e | 612 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 166 | 7.8% |
| Value | Count | Frequency (%) |
| e | 606 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 160 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2116 |
| Value | Count | Frequency (%) |
| (unknown) | 2104 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 612 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 166 | 7.8% |
| Value | Count | Frequency (%) |
| e | 606 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 160 | 7.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2116 |
| Value | Count | Frequency (%) |
| (unknown) | 2104 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 612 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 166 | 7.8% |
| Value | Count | Frequency (%) |
| e | 606 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 160 | 7.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2116 |
| Value | Count | Frequency (%) |
| (unknown) | 2104 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 612 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 166 | 7.8% |
| Value | Count | Frequency (%) |
| e | 606 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 160 | 7.6% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 77 | 73 |
| Distinct (%) | 21.1% | 21.2% |
| Missing | 81 | 101 |
| Missing (%) | 18.2% | 22.6% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.448411 | 28.481652 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| Maximum | 74 | 74 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| 5-th percentile | 4 | 4 |
| Q1 | 19 | 20 |
| median | 28.5 | 27 |
| Q3 | 39 | 36 |
| 95-th percentile | 55 | 54 |
| Maximum | 74 | 74 |
| Range | 73.58 | 73.58 |
| Interquartile range (IQR) | 20 | 16 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.523908 | 13.978708 |
| Coefficient of variation (CV) | 0.49319837 | 0.49079695 |
| Kurtosis | -0.052022994 | 0.47179103 |
| Mean | 29.448411 | 28.481652 |
| Median Absolute Deviation (MAD) | 9.5 | 8 |
| Skewness | 0.27915925 | 0.44265153 |
| Sum | 10748.67 | 9826.17 |
| Variance | 210.94392 | 195.40427 |
| Monotonicity | Not monotonic | Not monotonic |
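Because Age has missing values, its percentages do not all share one denominator. The reported numbers are consistent with Missing (%) being taken over all 446 observations, and with the mean and Distinct (%) being taken over the non-missing values only. The sketch below checks that reading; it is inferred from the figures, not taken from the tool's documentation.

```python
# Age figures copied from the report.
datasets = {
    "Dataset A": {"n": 446, "missing": 81, "sum": 10748.67, "distinct": 77},
    "Dataset B": {"n": 446, "missing": 101, "sum": 9826.17, "distinct": 73},
}

summary = {}
for name, d in datasets.items():
    valid = d["n"] - d["missing"]  # observations with a recorded age
    summary[name] = {
        "valid": valid,
        "mean": d["sum"] / valid,  # mean over non-missing values
        "missing_pct": round(100 * d["missing"] / d["n"], 1),
        "distinct_pct": round(100 * d["distinct"] / valid, 1),
    }

for name, row in summary.items():
    print(name, row)
```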
| Value | Count | Frequency (%) |
| 18 | 15 | 3.4% |
| 25 | 14 | 3.1% |
| 28 | 14 | 3.1% |
| 21 | 14 | 3.1% |
| 19 | 13 | 2.9% |
| 36 | 12 | 2.7% |
| 16 | 12 | 2.7% |
| 35 | 12 | 2.7% |
| 39 | 12 | 2.7% |
| 30 | 12 | 2.7% |
| Other values (67) | 235 | |
| (Missing) | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 24 | 18 | 4.0% |
| 21 | 15 | 3.4% |
| 27 | 13 | 2.9% |
| 22 | 13 | 2.9% |
| 18 | 13 | 2.9% |
| 16 | 12 | 2.7% |
| 19 | 12 | 2.7% |
| 36 | 12 | 2.7% |
| 30 | 11 | 2.5% |
| 26 | 11 | 2.5% |
| Other values (63) | 215 | |
| (Missing) | 101 | 22.6% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.75 | 2 | 0.4% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 1 | 0.2% |
| 2 | 5 | |
| 3 | 4 | |
| 4 | 7 | |
| 5 | 4 | |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.67 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 5 | |
| 3 | 3 | |
| 4 | 6 | |
| 5 | 4 | |
| 6 | 2 | 0.4% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 7 |
| Distinct (%) | 1.3% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.4573991 | 0.47757848 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 8 |
| Zeros | 306 | 308 |
| Zeros (%) | 68.6% | 69.1% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 8 |
| Range | 5 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.8541907 | 1.0507095 |
| Coefficient of variation (CV) | 1.8674954 | 2.2000772 |
| Kurtosis | 7.2758737 | 24.33517 |
| Mean | 0.4573991 | 0.47757848 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.534483 | 4.2861733 |
| Sum | 204 | 213 |
| Variance | 0.72964176 | 1.1039905 |
| Monotonicity | Not monotonic | Not monotonic |
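SibSp takes only a handful of values, so its Sum, Mean, Variance and Standard deviation can be rebuilt from the value counts reported for this variable. The sketch below does that with the sample variance (dividing by n − 1). That choice is an inference: it is the one that reproduces the reported figures.

```python
import math


def stats_from_counts(counts: dict) -> dict:
    """Sum, mean, sample variance and standard deviation from {value: count}."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = squared_dev / (n - 1)  # sample variance, assumed from the figures
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": math.sqrt(variance)}


# Value counts for SibSp, copied from the report.
sibsp_a = stats_from_counts({0: 306, 1: 106, 2: 16, 3: 7, 4: 10, 5: 1})
sibsp_b = stats_from_counts({0: 308, 1: 108, 2: 13, 3: 7, 4: 4, 5: 2, 8: 4})

print(sibsp_a)
print(sibsp_b)
```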
| Value | Count | Frequency (%) |
| 0 | 306 | |
| 1 | 106 | 23.8% |
| 2 | 16 | 3.6% |
| 4 | 10 | 2.2% |
| 3 | 7 | 1.6% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 308 | |
| 1 | 108 | 24.2% |
| 2 | 13 | 2.9% |
| 3 | 7 | 1.6% |
| 8 | 4 | 0.9% |
| 4 | 4 | 0.9% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 306 | |
| 1 | 106 | 23.8% |
| 2 | 16 | 3.6% |
| 3 | 7 | 1.6% |
| 4 | 10 | 2.2% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 308 | |
| 1 | 108 | 24.2% |
| 2 | 13 | 2.9% |
| 3 | 7 | 1.6% |
| 4 | 4 | 0.9% |
| 5 | 2 | 0.4% |
| 8 | 4 | 0.9% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.40807175 | 0.37892377 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 6 | 6 |
| Zeros | 336 | 338 |
| Zeros (%) | 75.3% | 75.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 6 |
| Range | 6 | 6 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.86859171 | 0.81414722 |
| Coefficient of variation (CV) | 2.1285269 | 2.1485779 |
| Kurtosis | 10.822869 | 12.212103 |
| Mean | 0.40807175 | 0.37892377 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.9096457 | 3.0177092 |
| Sum | 182 | 169 |
| Variance | 0.75445155 | 0.66283569 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 336 | |
| 1 | 60 | 13.5% |
| 2 | 41 | 9.2% |
| 5 | 4 | 0.9% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 338 | |
| 1 | 65 | 14.6% |
| 2 | 35 | 7.8% |
| 5 | 3 | 0.7% |
| 3 | 3 | 0.7% |
| 4 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 336 | |
| 1 | 60 | 13.5% |
| 2 | 41 | 9.2% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 4 | 0.9% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 338 | |
| 1 | 65 | 14.6% |
| 2 | 35 | 7.8% |
| 3 | 3 | 0.7% |
| 4 | 1 | 0.2% |
| 5 | 3 | 0.7% |
| 6 | 1 | 0.2% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 387 | 374 |
| Distinct (%) | 86.8% | 83.9% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.6681614 | 6.7286996 |
| Min length | 3 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2974 | 3001 |
| Distinct characters | 32 | 32 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 345 | 316 |
| Unique (%) | 77.4% | 70.9% |
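Ticket is the first variable where Distinct and Unique differ noticeably. A plausible reading, assumed here and not confirmed by the report itself, is that Distinct counts the different values present, while Unique counts the values that occur exactly once. The sketch below illustrates the difference on a small made-up list; the ticket strings are borrowed from the samples, but their repetition pattern is invented for illustration.

```python
from collections import Counter


def distinct_and_unique(values: list) -> tuple:
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical mini-column: two shared tickets and two single-use tickets.
tickets = ["347082", "347082", "PC 17475", "SC 1748", "11767", "11767"]
distinct, unique = distinct_and_unique(tickets)

print(distinct, unique)
```

Under this reading Unique can never exceed Distinct, which matches the reported figures (345 vs. 387 in Dataset A, 316 vs. 374 in Dataset B).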
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | PC 17475 | 24160 |
| 2nd row | 347082 | 349909 |
| 3rd row | SC 1748 | C.A. 37671 |
| 4th row | 11767 | 349201 |
| 5th row | 345763 | 315098 |
| Value | Count | Frequency (%) |
| pc | 27 | 4.9% |
| c.a | 15 | 2.7% |
| a/5 | 11 | 2.0% |
| ston/o | 6 | 1.1% |
| 2 | 6 | 1.1% |
| 347088 | 5 | 0.9% |
| 347082 | 5 | 0.9% |
| sc/paris | 4 | 0.7% |
| f.c.c | 4 | 0.7% |
| w./c | 4 | 0.7% |
| Other values (406) | 469 |
| Value | Count | Frequency (%) |
| pc | 36 | 6.3% |
| c.a | 17 | 3.0% |
| ca | 8 | 1.4% |
| a/5 | 8 | 1.4% |
| 2 | 6 | 1.0% |
| ston/o | 6 | 1.0% |
| w./c | 5 | 0.9% |
| 1601 | 4 | 0.7% |
| 2343 | 4 | 0.7% |
| soton/o.q | 4 | 0.7% |
| Other values (394) | 477 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 375 | |
| 1 | 366 | |
| 2 | 291 | |
| 7 | 267 | |
| 4 | 230 | |
| 0 | 216 | |
| 6 | 207 | 7.0% |
| 5 | 199 | 6.7% |
| 8 | 144 | 4.8% |
| 9 | 140 | 4.7% |
| Other values (22) | 539 |
| Value | Count | Frequency (%) |
| 3 | 360 | |
| 1 | 344 | |
| 2 | 288 | |
| 7 | 242 | 8.1% |
| 6 | 223 | 7.4% |
| 4 | 209 | 7.0% |
| 5 | 206 | 6.9% |
| 0 | 203 | 6.8% |
| 9 | 173 | 5.8% |
| 8 | 139 | 4.6% |
| Other values (22) | 614 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2974 |
| Value | Count | Frequency (%) |
| (unknown) | 3001 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 375 | |
| 1 | 366 | |
| 2 | 291 | |
| 7 | 267 | |
| 4 | 230 | |
| 0 | 216 | |
| 6 | 207 | 7.0% |
| 5 | 199 | 6.7% |
| 8 | 144 | 4.8% |
| 9 | 140 | 4.7% |
| Other values (22) | 539 |
| Value | Count | Frequency (%) |
| 3 | 360 | |
| 1 | 344 | |
| 2 | 288 | |
| 7 | 242 | 8.1% |
| 6 | 223 | 7.4% |
| 4 | 209 | 7.0% |
| 5 | 206 | 6.9% |
| 0 | 203 | 6.8% |
| 9 | 173 | 5.8% |
| 8 | 139 | 4.6% |
| Other values (22) | 614 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2974 |
| Value | Count | Frequency (%) |
| (unknown) | 3001 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 375 | |
| 1 | 366 | |
| 2 | 291 | |
| 7 | 267 | |
| 4 | 230 | |
| 0 | 216 | |
| 6 | 207 | 7.0% |
| 5 | 199 | 6.7% |
| 8 | 144 | 4.8% |
| 9 | 140 | 4.7% |
| Other values (22) | 539 |
| Value | Count | Frequency (%) |
| 3 | 360 | |
| 1 | 344 | |
| 2 | 288 | |
| 7 | 242 | 8.1% |
| 6 | 223 | 7.4% |
| 4 | 209 | 7.0% |
| 5 | 206 | 6.9% |
| 0 | 203 | 6.8% |
| 9 | 173 | 5.8% |
| 8 | 139 | 4.6% |
| Other values (22) | 614 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2974 |
| Value | Count | Frequency (%) |
| (unknown) | 3001 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 375 | |
| 1 | 366 | |
| 2 | 291 | |
| 7 | 267 | |
| 4 | 230 | |
| 0 | 216 | |
| 6 | 207 | 7.0% |
| 5 | 199 | 6.7% |
| 8 | 144 | 4.8% |
| 9 | 140 | 4.7% |
| Other values (22) | 539 |
| Value | Count | Frequency (%) |
| 3 | 360 | |
| 1 | 344 | |
| 2 | 288 | |
| 7 | 242 | 8.1% |
| 6 | 223 | 7.4% |
| 4 | 209 | 7.0% |
| 5 | 206 | 6.9% |
| 0 | 203 | 6.8% |
| 9 | 173 | 5.8% |
| 8 | 139 | 4.6% |
| Other values (22) | 614 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 179 | 176 |
| Distinct (%) | 40.1% | 39.5% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.944842 | 32.795645 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 5 | 9 |
| Zeros (%) | 1.1% | 2.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.225 |
| Q1 | 7.925 | 7.8958 |
| median | 14.12915 | 14.5 |
| Q3 | 29.7 | 31.20625 |
| 95-th percentile | 108.28125 | 110.8833 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 21.775 | 23.31045 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 47.890684 | 49.611976 |
| Coefficient of variation (CV) | 1.5992966 | 1.5127611 |
| Kurtosis | 48.507288 | 25.339136 |
| Mean | 29.944842 | 32.795645 |
| Median Absolute Deviation (MAD) | 6.47915 | 6.9646 |
| Skewness | 5.8452184 | 4.1630847 |
| Sum | 13355.4 | 14626.858 |
| Variance | 2293.5176 | 2461.3482 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 13 | 25 | 5.6% |
| 26 | 18 | 4.0% |
| 7.75 | 17 | 3.8% |
| 8.05 | 17 | 3.8% |
| 7.8958 | 16 | 3.6% |
| 10.5 | 14 | 3.1% |
| 7.925 | 10 | 2.2% |
| 7.775 | 8 | 1.8% |
| 8.6625 | 8 | 1.8% |
| 7.8542 | 8 | 1.8% |
| Other values (169) | 305 |
| Value | Count | Frequency (%) |
| 13 | 19 | 4.3% |
| 8.05 | 19 | 4.3% |
| 7.8958 | 18 | 4.0% |
| 7.75 | 18 | 4.0% |
| 26 | 13 | 2.9% |
| 10.5 | 13 | 2.9% |
| 7.925 | 10 | 2.2% |
| 0 | 9 | 2.0% |
| 7.2292 | 9 | 2.0% |
| 7.775 | 9 | 2.0% |
| Other values (166) | 309 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 5 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 4 |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 4.0125 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 3 | 0.7% |
| 7.0542 | 1 | 0.2% |
| 7.1417 | 1 | 0.2% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 78 | 91 |
| Distinct (%) | 82.1% | 83.5% |
| Missing | 351 | 337 |
| Missing (%) | 78.7% | 75.6% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 11 |
| Median length | 3 | 3 |
| Mean length | 3.5578947 | 3.4311927 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 338 | 374 |
| Distinct characters | 19 | 18 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 63 | 76 |
| Unique (%) | 66.3% | 69.7% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | E24 | B5 |
| 2nd row | C50 | E36 |
| 3rd row | A36 | C93 |
| 4th row | C128 | B78 |
| 5th row | A24 | C126 |
| Value | Count | Frequency (%) |
| f | 4 | 3.6% |
| b96 | 3 | 2.7% |
| c22 | 3 | 2.7% |
| c26 | 3 | 2.7% |
| b98 | 3 | 2.7% |
| e24 | 2 | 1.8% |
| e44 | 2 | 1.8% |
| e8 | 2 | 1.8% |
| g73 | 2 | 1.8% |
| c92 | 2 | 1.8% |
| Other values (77) | 85 |
| Value | Count | Frequency (%) |
| g6 | 4 | 3.3% |
| c25 | 3 | 2.4% |
| c27 | 3 | 2.4% |
| c23 | 3 | 2.4% |
| e24 | 2 | 1.6% |
| c126 | 2 | 1.6% |
| f4 | 2 | 1.6% |
| b60 | 2 | 1.6% |
| b58 | 2 | 1.6% |
| c26 | 2 | 1.6% |
| Other values (89) | 98 |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 32 | 9.5% |
| 2 | 31 | 9.2% |
| C | 30 | 8.9% |
| 1 | 28 | 8.3% |
| 6 | 26 | 7.7% |
| 3 | 24 | 7.1% |
| 8 | 20 | 5.9% |
| 7 | 18 | 5.3% |
| E | 18 | 5.3% |
| 5 | 18 | 5.3% |
| Other values (9) | 93 |
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 42 | |
| 1 | 33 | 8.8% |
| 5 | 27 | 7.2% |
| 3 | 25 | 6.7% |
| 6 | 24 | 6.4% |
| B | 24 | 6.4% |
| 4 | 22 | 5.9% |
| E | 19 | 5.1% |
| 0 | 19 | 5.1% |
| Other values (8) | 95 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 338 |
| Value | Count | Frequency (%) |
| (unknown) | 374 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| B | 32 | 9.5% |
| 2 | 31 | 9.2% |
| C | 30 | 8.9% |
| 1 | 28 | 8.3% |
| 6 | 26 | 7.7% |
| 3 | 24 | 7.1% |
| 8 | 20 | 5.9% |
| 7 | 18 | 5.3% |
| E | 18 | 5.3% |
| 5 | 18 | 5.3% |
| Other values (9) | 93 |
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 42 | |
| 1 | 33 | 8.8% |
| 5 | 27 | 7.2% |
| 3 | 25 | 6.7% |
| 6 | 24 | 6.4% |
| B | 24 | 6.4% |
| 4 | 22 | 5.9% |
| E | 19 | 5.1% |
| 0 | 19 | 5.1% |
| Other values (8) | 95 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 338 |
| Value | Count | Frequency (%) |
| (unknown) | 374 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| B | 32 | 9.5% |
| 2 | 31 | 9.2% |
| C | 30 | 8.9% |
| 1 | 28 | 8.3% |
| 6 | 26 | 7.7% |
| 3 | 24 | 7.1% |
| 8 | 20 | 5.9% |
| 7 | 18 | 5.3% |
| E | 18 | 5.3% |
| 5 | 18 | 5.3% |
| Other values (9) | 93 |
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 42 | |
| 1 | 33 | 8.8% |
| 5 | 27 | 7.2% |
| 3 | 25 | 6.7% |
| 6 | 24 | 6.4% |
| B | 24 | 6.4% |
| 4 | 22 | 5.9% |
| E | 19 | 5.1% |
| 0 | 19 | 5.1% |
| Other values (8) | 95 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 338 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 374 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| B | 32 | 9.5% |
| 2 | 31 | 9.2% |
| C | 30 | 8.9% |
| 1 | 28 | 8.3% |
| 6 | 26 | 7.7% |
| 3 | 24 | 7.1% |
| 8 | 20 | 5.9% |
| 7 | 18 | 5.3% |
| E | 18 | 5.3% |
| 5 | 18 | 5.3% |
| Other values (9) | 93 | 27.5% |
| Value | Count | Frequency (%) |
| 2 | 44 | 11.8% |
| C | 42 | 11.2% |
| 1 | 33 | 8.8% |
| 5 | 27 | 7.2% |
| 3 | 25 | 6.7% |
| 6 | 24 | 6.4% |
| B | 24 | 6.4% |
| 4 | 22 | 5.9% |
| E | 19 | 5.1% |
| 0 | 19 | 5.1% |
| Other values (8) | 95 | 25.4% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 1 |
| Missing (%) | 0.0% | 0.2% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 446 | 445 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | S | S |
| 2nd row | S | S |
| 3rd row | C | S |
| 4th row | C | S |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 328 | 73.5% |
| C | 77 | 17.3% |
| Q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 89 | 20.0% |
| Q | 41 | 9.2% |
| (Missing) | 1 | 0.2% |
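The Frequency (%) column can be recomputed from the Count column. The snippet below is a minimal sketch in plain Python, not the profiling tool's own code; the helper name `frequency_table` is hypothetical, and it assumes each percentage is the count divided by the sum of all rows (including the missing row), rounded to one decimal place.

```python
def frequency_table(counts):
    """Return {value: percentage string} with one decimal place.

    Assumption: the denominator is the sum of all listed counts,
    which here includes the "(Missing)" row.
    """
    total = sum(counts.values())
    return {value: f"{100 * n / total:.1f}%" for value, n in counts.items()}


# Dataset B counts for Embarked, copied from the Common Values table above.
embarked_b = {"S": 315, "C": 89, "Q": 41, "(Missing)": 1}

table = frequency_table(embarked_b)
for value, pct in table.items():
    print(f"{value}: {pct}")
```

With these counts the sketch reproduces the 20.0%, 9.2%, and 0.2% shown for C, Q, and the missing value, and gives 70.6% for S.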
[Plots: Length; Common Values (Dataset A, Dataset B)]
| Value | Count | Frequency (%) |
| s | 328 | 73.5% |
| c | 77 | 17.3% |
| q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| s | 315 | 70.8% |
| c | 89 | 20.0% |
| q | 41 | 9.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 328 | 73.5% |
| C | 77 | 17.3% |
| Q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.8% |
| C | 89 | 20.0% |
| Q | 41 | 9.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 328 | 73.5% |
| C | 77 | 17.3% |
| Q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.8% |
| C | 89 | 20.0% |
| Q | 41 | 9.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 328 | 73.5% |
| C | 77 | 17.3% |
| Q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.8% |
| C | 89 | 20.0% |
| Q | 41 | 9.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 328 | 73.5% |
| C | 77 | 17.3% |
| Q | 41 | 9.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.8% |
| C | 89 | 20.0% |
| Q | 41 | 9.2% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 701 | 702 | 1 | 1 | Silverthorne, Mr. Spencer Victor | male | 35.0 | 0 | 0 | PC 17475 | 26.2875 | E24 | S |
| 542 | 543 | 0 | 3 | Andersson, Miss. Sigrid Elisabeth | female | 11.0 | 4 | 2 | 347082 | 31.2750 | NaN | S |
| 389 | 390 | 1 | 2 | Lehmann, Miss. Bertha | female | 17.0 | 0 | 0 | SC 1748 | 12.0000 | NaN | C |
| 879 | 880 | 1 | 1 | Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) | female | 56.0 | 0 | 1 | 11767 | 83.1583 | C50 | C |
| 18 | 19 | 0 | 3 | Vander Planke, Mrs. Julius (Emelia Maria Vandemoortele) | female | 31.0 | 1 | 0 | 345763 | 18.0000 | NaN | S |
| 212 | 213 | 0 | 3 | Perkin, Mr. John Henry | male | 22.0 | 0 | 0 | A/5 21174 | 7.2500 | NaN | S |
| 864 | 865 | 0 | 2 | Gill, Mr. John William | male | 24.0 | 0 | 0 | 233866 | 13.0000 | NaN | S |
| 806 | 807 | 0 | 1 | Andrews, Mr. Thomas Jr | male | 39.0 | 0 | 0 | 112050 | 0.0000 | A36 | S |
| 752 | 753 | 0 | 3 | Vande Velde, Mr. Johannes Joseph | male | 33.0 | 0 | 0 | 345780 | 9.5000 | NaN | S |
| 425 | 426 | 0 | 3 | Wiseman, Mr. Phillippe | male | NaN | 0 | 0 | A/4. 34244 | 7.2500 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 689 | 690 | 1 | 1 | Madill, Miss. Georgette Alexandra | female | 15.0 | 0 | 1 | 24160 | 211.3375 | B5 | S |
| 7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S |
| 348 | 349 | 1 | 3 | Coutts, Master. William Loch "William" | male | 3.0 | 1 | 1 | C.A. 37671 | 15.9000 | NaN | S |
| 738 | 739 | 0 | 3 | Ivanoff, Mr. Kanio | male | NaN | 0 | 0 | 349201 | 7.8958 | NaN | S |
| 821 | 822 | 1 | 3 | Lulic, Mr. Nikola | male | 27.0 | 0 | 0 | 315098 | 8.6625 | NaN | S |
| 620 | 621 | 0 | 3 | Yasbeck, Mr. Antoni | male | 27.0 | 1 | 0 | 2659 | 14.4542 | NaN | C |
| 309 | 310 | 1 | 1 | Francatelli, Miss. Laura Mabel | female | 30.0 | 0 | 0 | PC 17485 | 56.9292 | E36 | C |
| 794 | 795 | 0 | 3 | Dantcheff, Mr. Ristiu | male | 25.0 | 0 | 0 | 349203 | 7.8958 | NaN | S |
| 47 | 48 | 1 | 3 | O'Driscoll, Miss. Bridget | female | NaN | 0 | 0 | 14311 | 7.7500 | NaN | Q |
| 224 | 225 | 1 | 1 | Hoyt, Mr. Frederick Maxfield | male | 38.0 | 1 | 0 | 19943 | 90.0000 | C93 | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 714 | 715 | 0 | 2 | Greenberg, Mr. Samuel | male | 52.0 | 0 | 0 | 250647 | 13.0000 | NaN | S |
| 166 | 167 | 1 | 1 | Chibnall, Mrs. (Edith Martha Bowerman) | female | NaN | 0 | 1 | 113505 | 55.0000 | E33 | S |
| 116 | 117 | 0 | 3 | Connors, Mr. Patrick | male | 70.5 | 0 | 0 | 370369 | 7.7500 | NaN | Q |
| 167 | 168 | 0 | 3 | Skoog, Mrs. William (Anna Bernhardina Karlsson) | female | 45.0 | 1 | 4 | 347088 | 27.9000 | NaN | S |
| 517 | 518 | 0 | 3 | Ryan, Mr. Patrick | male | NaN | 0 | 0 | 371110 | 24.1500 | NaN | Q |
| 193 | 194 | 1 | 2 | Navratil, Master. Michel M | male | 3.0 | 1 | 1 | 230080 | 26.0000 | F2 | S |
| 285 | 286 | 0 | 3 | Stankovic, Mr. Ivan | male | 33.0 | 0 | 0 | 349239 | 8.6625 | NaN | C |
| 873 | 874 | 0 | 3 | Vander Cruyssen, Mr. Victor | male | 47.0 | 0 | 0 | 345765 | 9.0000 | NaN | S |
| 868 | 869 | 0 | 3 | van Melkebeke, Mr. Philemon | male | NaN | 0 | 0 | 345777 | 9.5000 | NaN | S |
| 265 | 266 | 0 | 2 | Reeves, Mr. David | male | 36.0 | 0 | 0 | C.A. 17248 | 10.5000 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 54 | 55 | 0 | 1 | Ostby, Mr. Engelhart Cornelius | male | 65.00 | 0 | 1 | 113509 | 61.9792 | B30 | C |
| 468 | 469 | 0 | 3 | Scanlan, Mr. James | male | NaN | 0 | 0 | 36209 | 7.7250 | NaN | Q |
| 42 | 43 | 0 | 3 | Kraeff, Mr. Theodor | male | NaN | 0 | 0 | 349253 | 7.8958 | NaN | C |
| 733 | 734 | 0 | 2 | Berriman, Mr. William John | male | 23.00 | 0 | 0 | 28425 | 13.0000 | NaN | S |
| 637 | 638 | 0 | 2 | Collyer, Mr. Harvey | male | 31.00 | 1 | 1 | C.A. 31921 | 26.2500 | NaN | S |
| 486 | 487 | 1 | 1 | Hoyt, Mrs. Frederick Maxfield (Jane Anne Forby) | female | 35.00 | 1 | 0 | 19943 | 90.0000 | C93 | S |
| 540 | 541 | 1 | 1 | Crosby, Miss. Harriet R | female | 36.00 | 0 | 2 | WE/P 5735 | 71.0000 | B22 | S |
| 756 | 757 | 0 | 3 | Carlsson, Mr. August Sigfrid | male | 28.00 | 0 | 0 | 350042 | 7.7958 | NaN | S |
| 354 | 355 | 0 | 3 | Yousif, Mr. Wazli | male | NaN | 0 | 0 | 2647 | 7.2250 | NaN | C |
| 755 | 756 | 1 | 2 | Hamalainen, Master. Viljo | male | 0.67 | 1 | 1 | 250649 | 14.5000 | NaN | S |
Dataset A
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | |||||||||||||
Dataset B
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | |||||||||||||
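A duplicate-row check of the kind reported above can be sketched in plain Python. This is an illustrative assumption about how such a check might work, not the profiling tool's implementation; the helper name `duplicate_rows` is hypothetical, and the rows are a reduced projection (PassengerId, Survived, Pclass, Sex, Embarked) of three Dataset A sample rows.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# (PassengerId, Survived, Pclass, Sex, Embarked) taken from the Dataset A sample.
sample = [
    (702, 1, 1, "male", "S"),
    (543, 0, 3, "female", "S"),
    (390, 1, 2, "female", "C"),
]

no_dupes = duplicate_rows(sample)                  # empty: all rows distinct
with_dupes = duplicate_rows(sample + [sample[0]])  # first row repeated once

print(no_dupes)
print(with_dupes)
```

Because PassengerId is unique in both datasets, any check that includes that column cannot find duplicate rows, which is consistent with the empty tables above.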